Automated Information Extraction from Web APIs Documentation
Abstract
Web API providers rarely follow standard practices when implementing, publishing, and documenting their APIs. As a consequence, the discovery and use of these services by third parties is significantly hampered. To enable further automation in exploiting Web APIs, we present an approach for automatically extracting relevant technical information from the Web pages that document them. In particular, we have devised two algorithms that automatically extract technical details such as operation names, operation descriptions, and URI templates from the documentation of Web APIs adopting either RPC or RESTful interfaces. The algorithms, which exploit advanced DOM processing as well as state-of-the-art Information Extraction and Natural Language Processing techniques, have been evaluated against a detailed dataset and exhibit high precision and recall (around 90% for both REST and RPC APIs), outperforming state-of-the-art information extraction algorithms.
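To make the extraction task concrete, the following is a minimal sketch of pulling HTTP methods and URI templates out of an HTML documentation page. It is not the authors' algorithm: instead of the DOM processing and NLP techniques the abstract mentions, it uses a plain regular-expression heuristic over the page text. The names `TextCollector`, `ENDPOINT`, and `extract_endpoints`, as well as the sample page, are hypothetical and chosen only for illustration.

```python
import re
from html.parser import HTMLParser

# Assumption: these tags delimit separate text blocks in a documentation page.
BLOCK_TAGS = {"p", "li", "pre", "td", "h1", "h2", "h3", "h4", "dt", "dd", "div"}

# Assumption: an endpoint is written as an HTTP method followed by a path or URL.
ENDPOINT = re.compile(
    r"\b(GET|POST|PUT|DELETE|PATCH)[ \t]+((?:https?://|/)[^\s<>\"']+)"
)


class TextCollector(HTMLParser):
    """Collects the visible text of a page, breaking lines at block tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def extract_endpoints(html):
    """Return (method, uri_template) pairs found in the page text."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    text = "".join(collector.parts)
    results = []
    for match in ENDPOINT.finditer(text):
        # Drop sentence punctuation that may trail the URI in running prose.
        uri = match.group(2).rstrip(".,;)")
        results.append((match.group(1), uri))
    return results


sample_page = (
    "<h2>List photos</h2><p>Returns the photos of a user.</p>"
    "<pre>GET /users/{user_id}/photos</pre>"
    "<p>Use <code>POST https://api.example.com/v1/photos</code> to upload.</p>"
)
print(extract_endpoints(sample_page))
```

A heuristic like this only works when the documentation writes the method next to the URI. Pages that describe operations in tables or free prose would need the kind of structural and linguistic analysis the abstract refers to.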
Similar resources
Towards open services on the Web : a semantic approach
Knowledge Media Institute (KMi) Doctor of Philosophy in Computer Science by Dipl.-Inform. Maria Maleshkova The World Wide Web (WWW) has significantly evolved since it was first released as a publicly available service on the Internet, developing from a collection of a few interlinked static pages to a global ubiquitous platform for sharing, searching and browsing dynamic and customisable conten...
Harnessing the Crowds for Automating the Identification of Web APIs
Supporting the efficient discovery and use of Web APIs is increasingly important as their use and popularity grows. Yet, a simple task like finding potentially interesting APIs and their related documentation turns out to be hard and time consuming even when using the best resources currently available on the Web. In this paper we describe our research towards an automated Web API documentation...
The Information Gathering Strategies of API Learners
API users experience significant difficulties when learning how to use APIs, but little is known about the strategies used to overcome these difficulties, the motivation for each strategy, or the trade-offs between the strategies. To better understand the information seeking strategies of API users, we conducted a study in which 20 participants were asked to complete programming tasks using unf...
Automatically Extracting Web API Specifications from HTML Documentation
Web API specifications are machine-readable descriptions of APIs. These specifications, in combination with related tooling, simplify and support the consumption of APIs. However, despite the increased distribution of web APIs, specifications are rare and their creation and maintenance heavily relies on manual efforts by third parties. In this paper, we propose an automatic approach and an asso...
From Software APIs to Web Service Ontologies: A Semi-automatic Extraction Method
Successful employment of semantic web services depends on the availability of high quality ontologies to describe the domains of these services. As always, building such ontologies is difficult and costly, thus hampering web service deployment. Our hypothesis is that since the functionality offered by a web service is reflected by the underlying software, domain ontologies could be built by ana...
Publication date: 2012